Exploration and Exploitation in Parkinson’s Disease: Behavioral Analyses

Authors
Affiliations

Björn Meder

Health and Medical University, Potsdam, Germany

Martha Sterf

Medical School Berlin, Berlin, Germany

Charley M. Wu

University of Tübingen, Tübingen, Germany

Matthias Guggenmos

Health and Medical University, Potsdam, Germany

Published

July 25, 2025

Code
# Housekeeping: Load packages and helper functions
# Housekeeping
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(message = FALSE)
knitr::opts_chunk$set(warning = FALSE)
knitr::opts_chunk$set(fig.align='center')
knitr::opts_chunk$set(prefer_html = TRUE)

options(knitr.kable.NA = '')
options(brms.backend = "cmdstanr") # more stability for M1 Mac

packages <- c('gridExtra', 'BayesFactor', 'tidyverse', "RColorBrewer", "lme4", "sjPlot", "lsr", "brms", "kableExtra", "afex", "emmeans", "viridis", "ggpubr", "hms", "scales", "cowplot", "gtsummary", "webshot", "webshot2", "parameters", "bridgesampling", "cmdstanr")
lapply(packages, require, character.only = TRUE)

set.seed(0815)

# file with various statistical functions, among other things it provides tests for Bayes Factors (BFs)
source('statisticalTests.R')

# Wrapper for brm models: saves the full model to disk the first time it is run, otherwise loads it from disk
run_model <- function(expr, modelName, path='brm', reuse = TRUE) {
  path <- paste0(path, '/', modelName, ".brm")
  fit <- NULL
  if (reuse) {
    fit <- suppressWarnings(try(readRDS(path), silent = TRUE))
  }
  if (is.null(fit) || is(fit, "try-error")) { # fit from scratch if no saved model exists (or reuse = FALSE)
    fit <- eval(expr)
    saveRDS(fit, file = path)
  }
  fit
}

# Setting some plotting params
w_box          <- 0.2      # width of boxplot, also used for jittering points and lines    
line_jitter    <- w_box / 2
xAnnotate      <- -0.3

# jitter params
jit_height  <- 0.01
jit_width   <- 0.05
jit_alpha   <- 0.6

# colors 
groupcolors    <- c("#7570b3", "#1b9e77", "#d95f02")
choice3_colors <- c("#e7298a", "#66a61e", "#e6ab02")
Code
########################################################
# get behavioral data
########################################################
dat_gridsearch <- read_csv("data/data_gridsearch_Parkinson.csv", show_col_types = FALSE) %>% 
  mutate(type_choice  = factor(type_choice, levels = c("Repeat", "Near", "Far"))) 

# normalize reward and previous reward
dat_gridsearch$z = dat_gridsearch$z / 50
dat_gridsearch$previous_reward = dat_gridsearch$previous_reward / 50

########################################################
# get bonus round data
########################################################
dat_bonus   <- read_csv(file="data/data_gridsearch_Parkinson_bonusround.csv") %>% 
  mutate(bonus_environment = as.factor(bonus_environment))

########################################################
# get subject data
########################################################
dat_sample <- read_delim("data/data_gridsearch_subjects.csv", escape_double = FALSE, trim_ws = TRUE, show_col_types = FALSE) %>% 
  mutate(gender = as.factor(gender),
         group  = fct_recode(group, "Control" = "PNP"), # PD+ and PD- keep their labels
         group  = fct_relevel(group, "Control", "PD+", "PD-")
  ) %>% 
  mutate(last_ldopa = if_else(group != "Control", as_hms(last_ldopa), as_hms(NA)),
         next_ldopa = if_else(group != "Control", as_hms(next_ldopa), as_hms(NA)),
         time_exp = if_else(group != "Control", as_hms(time_exp), as_hms(NA))) %>% 
  mutate(time_since_ldopa = as.numeric(time_exp - last_ldopa, unit = "mins"))

# combine behavioral and subject data
dat <- dat_sample %>% 
  left_join(dat_gridsearch, by = "id") %>% 
  arrange(group)

1 Abstract

We investigated how patients with Parkinson’s disease (PD) balance the explore-exploit trade-off using a spatially correlated bandit task, where the spatial structure of rewards facilitated value generalization (i.e., nearby options yield similar rewards). Participants were tested either shortly after taking their regular Levodopa (L-Dopa) dose (N=33) or just before their next scheduled dose (N=32). Patients with polyneuropathy served as a control group (N=34), comparable in age, depressive symptoms, and basic cognitive functioning. Behavioral and computational analyses revealed distinct patterns of exploration and exploitation. PD patients on L-Dopa balanced exploration and exploitation, though not as efficiently as polyneuropathy patients. In stark contrast, patients off L-Dopa rarely exploited known high-value options and primarily explored novel ones. This overreliance on exploration impaired their ability to navigate the explore-exploit trade-off and maximize rewards. To better understand the mechanisms underlying these behavioral differences, we employed a computational approach using the Gaussian Process Upper Confidence Bound (GP-UCB) model. This model integrates similarity-based generalization with two distinct exploration mechanisms: directed exploration, which seeks to reduce uncertainty about rewards, and random exploration, which introduces stochastic variability in choice behavior. The model parameters showed that behavioral differences between the on- and off-medication conditions were primarily driven by differences in uncertainty-directed exploration, while the level of random exploration remained unchanged. Both PD groups showed reduced generalization compared to controls, contributing to poorer overall performance. Our findings indicate that L-Dopa selectively modulates uncertainty-directed exploration, providing a more nuanced understanding of the central role of dopamine in the regulation of exploratory behavior.

2 Intro

A central distinction between different forms of exploration behavior is that directed exploration reflects the drive for knowledge about novel options, whereas undirected exploration refers to random variability in the choice process (Giron et al., 2023; Meder et al., 2021; Sadeghiyeh et al., 2020; Schulz et al., 2019; Wu et al., 2018).

[TO BE CONTINUED]

3 Experiment

We investigated how patients with Parkinson’s disease (PD) manage the explore-exploit trade-off using a spatially correlated multi-armed bandit task. Participants accumulated rewards by selecting tiles (options) with normally distributed rewards. The spatial correlation between rewards facilitated generalization, allowing participants to adapt to the structure of the environment and balance exploring new options versus exploiting known high-reward options.

Screenshot from experiment, procedure and example environments

3.1 Materials and procedure

Forty distinct environments were generated using a radial basis function kernel with \(\lambda = 1\), creating a bivariate reward function on a grid that maps each tile location to a specific reward value. These reward functions varied gradually across the grid, creating environments with spatially correlated rewards. The correlation between neighboring options is \(r \approx 0.6\), decreasing exponentially with distance.
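To make the environment-generating process concrete, one such environment could be sampled from a Gaussian process, as sketched below. This is an illustrative sketch only: the grid size, the kernel parameterization \(k(d) = \exp(-d^2/(2\lambda^2))\), and the rescaling to [0, 50] are assumptions; the actual environments are loaded from file.

```r
# Hypothetical sketch: sample one spatially correlated reward environment
# from a Gaussian process with an RBF kernel (lambda = 1) on an n x n grid.
sample_environment <- function(n = 8, lambda = 1) {
  grid <- expand.grid(x = 1:n, y = 1:n)
  d2   <- as.matrix(dist(grid[, c("x", "y")]))^2  # squared Euclidean distances
  K    <- exp(-d2 / (2 * lambda^2))               # RBF kernel: neighbors correlate at exp(-0.5) ~ 0.6
  z    <- as.vector(t(chol(K + 1e-8 * diag(n^2))) %*% rnorm(n^2))  # one GP sample
  grid$z <- (z - min(z)) / (max(z) - min(z)) * 50 # rescale rewards to [0, 50]
  grid
}

set.seed(1)
env <- sample_environment()
range(env$z)  # 0 50
```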

Participants completed 10 rounds of the task, each featuring a new environment drawn without replacement from the set of 40 environments. In each round, participants had 25 choices to accumulate rewards. The first round served as a tutorial to familiarize participants with the task and was excluded from the analyses. The final round (round 10) was a bonus round where, after 15 choices, participants were asked to predict rewards for five unrevealed options. Data from this round were also excluded from the main analysis and analyzed separately.

At the start of each round, one tile was randomly revealed, and participants then sequentially sampled 25 tiles. On each trial, they could either click a new tile or re-click a previously selected one. Selections were made by clicking a tile on the computer screen with the mouse, upon which participants received a reward arbitrarily scaled to the range [0,50]. Re-clicked tiles showed small variations in reward due to normally distributed noise.

3.2 Sample

We collected data from adult participants with Parkinson’s disease (PD) who regularly receive Levodopa (L-Dopa) for symptomatic treatment (Abbott, 2010; Tambasco et al., 2018). Participants were recruited via a neurologist’s outpatient practice. Eligible participants were evaluated based on Hoehn-Yahr scores recorded in their patient files. The scale assesses disease severity and motor impairments based on a score from 1 to 5, with higher scores indicating greater severity (Goetz et al., 2004; Hoehn & Yahr, 1967). We limited recruitment to individuals with scores between 1 and 3, as scores of 4 and 5 reflect severe impairment.

PD patients were randomly assigned to two conditions: on medication (PD+) and off medication (PD-). In the PD+ group (N=33), patients’ scheduled L-Dopa dose was administered at least 30 minutes before the start of the experiment. In the PD- group (N=32), testing was timed relative to the next scheduled dose such that participants were in a low-dopamine state during the experiment, offering a clear contrast to the PD+ group. Thus, we refer to the ‘on medication’ condition as the state after taking L-Dopa and the ‘off medication’ condition as the state before the next scheduled dose.

The comparison group (N=34) consisted of individuals of similar age diagnosed with polyneuropathy (Control), a condition affecting the peripheral nervous system that can lead to physical symptoms such as pain, sensory loss, or motor weakness. However, unlike Parkinson’s disease, polyneuropathy does not involve central dopaminergic dysfunction or cognitive impairment.

3.3 Clinical assessment

To characterize participants’ clinical status, we employed standardized measures assessing Parkinson’s disease severity, basic cognitive function, and depressive symptoms. PD severity was evaluated using the Hoehn-Yahr scale, which rates motor impairments such as postural instability and gait difficulties (Hoehn & Yahr, 1967). Participants can receive a score between 1 and 5, with higher scores indicating more severe problems. Basic cognitive function of all participants was assessed through the Mini-Mental State Examination (MMSE), which is frequently used in patients with dementia (Folstein et al., 1975). The test comprises 30 questions pertaining to different domains, including memory (e.g., recalling three objects), temporal and spatial orientation (e.g., date and location), and arithmetic ability. Finally, all participants answered the German version of the Beck Depression Inventory II, a self-report inventory consisting of 21 items measuring depressive symptoms (Beck et al., 1996; Hautzinger et al., 2006).

3.4 Sample characteristics

Note

exclude participant with pump

Table 1 shows the demographics of our sample, along with their Hoehn-Yahr, MMSE, and BDI scores. In the PD+ group, the mean time since the last L-Dopa dose was 104 min; in the PD- group it was 253 min.

Code
dat_sample %>% 
  group_by(group) %>% 
  summarise(n = n(),
            female = sum(gender == "f"),
            mean_age = mean(age),
            sd_age = sd(age),
            mean_BDI = mean(BDI, na.rm= T), 
            mean_MMSE = mean(MMSE, na.rm= T),
            mean_HY = mean(hoehn_yahr, na.rm= T),
            mean_time_since_ldopa = mean(time_since_ldopa,  na.rm= T)) %>% 
  
  kable(., format = "html", escape = FALSE, digits = 1) %>%
  kable_styling("striped", full_width = FALSE)
Table 1
group n female mean_age sd_age mean_BDI mean_MMSE mean_HY mean_time_since_ldopa
Control 34 13 65.4 7.1 8.4 28.9
PD+ 33 11 64.1 6.2 8.5 29.1 1.9 104.4
PD- 32 9 66.7 6.5 8.3 28.8 1.9 253.3

Demographics and clinical assessment.

4 Behavioral data

All behavioral data are stored in data_gridsearch_Parkinson.csv, which contains the following variables (Table 2):

  • id: participant id
  • age: participant age in years
  • gender: (m)ale, (f)emale, (d)iverse
  • x and y: the sampled coordinates on the grid
  • chosen: identifier of the chosen tile
  • z: the reward obtained from the chosen tile, normalized to the range 0-1. Re-clicked tiles could show small variations in the observed color (i.e., underlying reward) due to normally distributed noise, \(\epsilon \sim N(0,1)\)
  • z_scaled: the observed outcome (reward), scaled in each round to a randomly drawn maximum value in the range of 70% to 90% of the highest reward value
  • trial: the trial number (0-25), with 0 corresponding to the initially revealed random tile, i.e., trial 1 is the first choice
  • round: the round number (1 through 10), with 1 = practice round (not analyzed) and 10 = bonus round (analyzed only for bonus round judgments)
  • distance: the Manhattan distance between consecutive clicks. NA for trial 0, the initially revealed random tile
  • type_choice: categorizes consecutive clicks as “Repeat” (clicking the same tile as on the previous trial), “Near” (clicking a directly neighboring tile, i.e., distance = 1), or “Far” (clicking a tile with distance > 1). NA for trial 0, i.e., the initially revealed random tile
  • previous_reward: the reward z obtained on the previous trial. NA for trial 0, i.e., the initially revealed random tile
  • last_ldopa: time of the last L-Dopa dose (HH:MM)
  • next_ldopa: scheduled time of the next L-Dopa dose (HH:MM)
  • time_exp: time of the experiment (HH:MM)
  • time_since_ldopa: time since the last L-Dopa dose (in minutes)
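As an illustration, distance and type_choice can be derived from consecutive x and y coordinates as follows (the toy tibble is hypothetical; the released data already contain both columns):

```r
library(dplyr)

# Hypothetical toy data in the same format as the behavioral file
toy <- tibble(trial = 0:3, x = c(3, 3, 3, 5), y = c(3, 3, 4, 6))

toy <- toy %>%
  mutate(distance = abs(x - lag(x)) + abs(y - lag(y)),  # Manhattan distance to previous click
         type_choice = case_when(
           is.na(distance) ~ NA_character_,             # trial 0: initially revealed tile
           distance == 0   ~ "Repeat",
           distance == 1   ~ "Near",
           TRUE            ~ "Far"
         ))

toy$type_choice  # NA "Repeat" "Near" "Far"
```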

We analyzed the behavioral data in terms of performance and exploration behavior. These analyses exclude the tutorial and bonus rounds, leaving a total of 200 search decisions (8 rounds \(\times\) 25 trials) for each participant. We then report the results of the bonus round, where we analyze participants’ reward predictions and confidence judgments. We report both frequentist statistics and Bayes factors (\(BF\)) to quantify the relative evidence of the data in favor of the alternative hypothesis (\(H_A\)) over the null hypothesis (\(H_0\)); see Appendix for details and references. Various helper functions are implemented in statisticalTests.R. Regression analyses were performed in a Bayesian framework with Stan, accessed via the R package brms, complemented by frequentist hierarchical regression analyses (via the lmer function from package lme4).
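For example, the pairwise group comparisons reported below can be run as default Bayes-factor t tests with the BayesFactor package (loaded in the setup chunk); a minimal sketch on toy data, not the actual rewards:

```r
library(BayesFactor)

# Toy data standing in for two groups' mean rewards (not the real data)
set.seed(1)
g1 <- rnorm(30, mean = 0.65, sd = 0.10)
g2 <- rnorm(30, mean = 0.45, sd = 0.10)

bf <- ttestBF(g1, g2)  # default JZS Bayes-factor t test
extractBF(bf)$bf       # BF > 1: evidence for H_A (a group difference) over H_0
```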

Code
# show example data
head(dat %>%
       group_by(group) %>%
       slice_head(n=25)  %>% 
       ungroup()) %>%
  kable("html", caption = "Example behavioral data.") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>% 
  scroll_box(width = "100%", height = "300px")
Table 2
Example behavioral data.
id age gender group BDI MMSE hoehn_yahr last_ldopa next_ldopa time_exp time_since_ldopa session x y chosen z zscaled time trial round distance type_choice previous_reward
111 54 f Control 12 30 1 3 3 28 0.38 18 1.719326e+12 0 1
111 54 f Control 12 30 1 3 3 28 0.38 18 1.719326e+12 1 1 0 Repeat 0.38
111 54 f Control 12 30 1 3 4 36 0.52 22 1.719326e+12 2 1 1 Near 0.38
111 54 f Control 12 30 1 4 5 45 0.64 26 1.719326e+12 3 1 2 Far 0.52
111 54 f Control 12 30 1 5 6 54 1.00 38 1.719326e+12 4 1 2 Far 0.64
111 54 f Control 12 30 1 6 7 63 0.26 14 1.719326e+12 5 1 2 Far 1.00

Example data.

4.1 Performance: Rewards by round

Code
# mean reward per subject (practice and bonus round excluded)
df_mean_reward_subject_by_round <- dat %>% 
  filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
  group_by(id, round) %>% 
  summarise(age = mean(age),
            group = first(group),
            sum_reward = sum(z),
            mean_reward = mean(z), 
            sd_reward = sd(z)) 

df_summary_by_round <- df_mean_reward_subject_by_round %>%
  group_by(round, group) %>%
  summarize(
    mean_of_means = mean(mean_reward, na.rm = TRUE),  # Renaming to avoid confusion
    se_reward = sd(mean_reward, na.rm = TRUE) / sqrt(n()),  # Standard error
    .groups = 'drop'
  )

aov_rounds <- aov_ez(
  id = "id",                 
  dv = "mean_reward",        
  within = "round",          
  between = "group",         
  data = df_mean_reward_subject_by_round
)


# kable(as.data.frame(aov_rounds$anova_table), 
#       format = "html", escape = FALSE, digits = 2, 
#       caption = "ANOVA results with round as within-subjects factor and group as between subjects factor, where rewards per round were first aggregated within subjects.") %>%
#   kable_styling("striped", full_width = FALSE)


brm_rounds <- run_model(
  expr = quote(brm(
    mean_reward ~ round * group + (1 | id),
    data = df_mean_reward_subject_by_round,
    family = gaussian(),
    iter = 4000,
    warmup = 1000,
    chains = 4,
    cores = 4,
    seed = 0511,
    backend = "cmdstanr",
    save_pars = save_pars(all = TRUE)
  )),
  modelName = 'brm_reward_rounds'
)

# Extract fitted values and add to data df
fitted_values <- fitted(brm_rounds, re_formula = NA)
df_mean_reward_subject_by_round$fitted_mean_reward <- fitted_values[, "Estimate"]

p <- ggplot(df_mean_reward_subject_by_round, aes(x = round, y = mean_reward, group = group, shape = group, color = group)) +
  geom_point(data = df_summary_by_round, aes(x = round, y = mean_of_means, shape = group), size = 3) +
  geom_line(aes(y = fitted_mean_reward), linewidth = 1) +  
  geom_jitter(aes(x = round, y = mean_reward), size = 1, alpha = 0.3, width = 0.2) +
  scale_y_continuous("Mean Reward", breaks = c(25,30,35)) +
  xlab("Round") +
  scale_fill_manual(values = groupcolors) +
  scale_color_manual(values = groupcolors) +
  ggtitle("Mean Reward by Rounds and Group (brms)") +
  theme_classic() +
  theme(legend.title = element_blank())

# tbl_regression(brm_rounds, exponentiate = F) 
#tab_model(brm_rounds)

# Reduced model for computing BF: no round term
brm_rounds_reduced <- run_model(brm(
  mean_reward ~ group + (1 | id),
  data = df_mean_reward_subject_by_round,
  family = gaussian(),
  iter = 4000, 
  warmup = 1000, 
  chains = 4, 
  cores = 4, 
  seed = 0511,
  backend = "cmdstanr",
  save_pars = save_pars(all = TRUE)),
modelName='brm_reward_rounds_reduced')

# Compute Bayes Factor: Full vs. Reduced (without round term)
#bf_brm_rounds <- bayes_factor(brm_rounds, brm_rounds_reduced)


# format_parameters(brm_rounds)
# 
# params <- model_parameters(brm_rounds) |> print_md()
# params$Parameter <- gsub("groupPDP", "PD+", params$Parameter)
# params$Parameter <- gsub("groupPDP", "PD+", params$Parameter)
# params$Term <- gsub("round", "Round", params$Term)
# 
# 
# params$Term <- gsub("groupPDP", "PD", params$Term)

# plot_model(brm_rounds, type = "est") +
#   theme_classic()

Figure 1 shows the obtained rewards by round, for each group. An ANOVA with round as within- and group as between-subjects factor showed a difference between groups, with Control patients achieving the greatest rewards, followed by PD+ and PD- patients, but no change across rounds (and no interaction). A Bayesian regression analysis yielded comparable results: the estimated effect of round on mean reward was negligible (Estimate = 0, 95% CI [-0.01, 0.01]), with the credible interval including zero. In the subsequent analyses, we therefore aggregate across rounds.

Code
# Plot the mean reward by round for each group with dodged points and error bars
ggplot(df_summary_by_round, aes(x = round, y = mean_of_means, group = group, shape = group,  color = group, fill = group)) +
  geom_line(position = position_dodge(width = 0.3)) +  
  # geom_errorbar(aes(ymin = mean_of_means - 1.96 * se_reward, ymax = mean_of_means + 1.96 * se_reward), width = 0.2, position = dodge, color = "black") +  # 
  geom_errorbar(aes(ymin = mean_of_means - se_reward, ymax = mean_of_means + se_reward), width = 0.2, position = position_dodge(width = 0.3), alpha=0.7) +  #
  geom_point(position = position_dodge(width = 0.3), size = 3, stroke = 1, alpha=.9) +  
  #coord_cartesian(ylim = c(20,40)) +
  coord_cartesian(ylim = c(0.4,0.75)) +
  #scale_shape_manual(values = c(21, 24, 22)) +  # circle, triangle, and square
  scale_fill_manual(values = groupcolors) +  
  scale_color_manual(values = groupcolors) +  
  #scale_color_manual(values = c("black","black","black")) +  
  # scale_y_continuous("Mean reward ± 95% CI") +
  scale_y_continuous("Mean reward ± SE") +
  scale_x_continuous("Round", breaks = 2:9) +
  theme_classic() +
  theme(legend.title = element_blank(),
      plot.title = element_text(size=16),
      axis.text = element_text(size=14),
      axis.title = element_text(size=14))

ggsave("plots/performance_rounds.png", width = 6, height = 3)
Figure 1: Performance over rounds (excluding tutorial and bonus round).

4.2 Performance: Rewards by group

Code
df_mean_reward_subject <- dat %>% 
  filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
  group_by(id) %>% 
  summarise(age = mean(age),
            group = first(group),
            sum_reward = sum(z),
            mean_reward = mean(z), 
            sd_reward = sd(z),
            BDI = first(BDI),          
            MMSE = first(MMSE),  
            hoehn_yahr = first(hoehn_yahr))

# some summary stats for obtained mean rewards
# df_mean_reward_subject %>%
#   group_by(group) %>%
#   summarise(n = n(),
#             m_reward = mean(mean_reward),
#             md_reward = median(mean_reward),
#             var_reward = var(mean_reward),
#             sd_reward = sd(mean_reward),
#             se_reward = sd_reward / sqrt(n),
#             lower_ci_reward = m_reward - qt(1 - (0.05 / 2), n - 1) * se_reward,
#             upper_ci_reward = m_reward + qt(1 - (0.05 / 2), n - 1) * se_reward) %>%
#   
#   kable(., format = "html", escape = FALSE, digits = 2) %>%
#   kable_styling("striped", full_width = FALSE)

Figure 2 shows the overall performance of each group, based on each subject’s mean reward across all trials. Control participants achieved higher rewards than both PD patients on medication (\(t(65)=2.5\), \(p=.014\), \(d=0.6\), \(BF=3.5\)) and off medication (\(t(63)=7.2\), \(p<.001\), \(d=1.8\), \(BF>100\)). Notably, PD patients on medication achieved substantially higher rewards than patients off medication (\(t(62)=5.9\), \(p<.001\), \(d=1.5\), \(BF>100\)), indicating a strong beneficial effect of L-Dopa on the ability to balance exploration and exploitation.

  • PD+ vs. PD-: \(t(62)=5.9\), \(p<.001\), \(d=1.5\), \(BF>100\)
  • Control vs. PD-: \(t(63)=7.2\), \(p<.001\), \(d=1.8\), \(BF>100\)
  • Control vs. PD+: \(t(65)=2.5\), \(p=.014\), \(d=0.6\), \(BF=3.5\)
Code
# Boxplots of rewards by group
p_performance <- ggplot(df_mean_reward_subject, aes(x = group, y = mean_reward, color = group, fill = group, shape = group)) +
  geom_hline(yintercept=.5, linetype='dashed', color='black') + # random choice model (mean across all 40 environments)
  geom_boxplot(alpha = 0.2, outlier.shape = NA, width = 0.4) +  
  geom_jitter(width = 0.15, size = 2, alpha = 0.8) +  
  stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 4) +  
  scale_color_manual(values = groupcolors) +
  scale_fill_manual(values = groupcolors) +
  ylab("Mean normalized reward") +
  xlab("") +
  ggtitle("Performance") +
  theme_classic() +
  theme(
    legend.position = 'none',
    legend.title = element_blank(),
      plot.title = element_text(size=24),
      axis.text = element_text(size=18),
      axis.title = element_text(size=18)
  )

p_performance
ggsave("plots/performance.png", p_performance, dpi=300, height = 5, width = 6 )

# Control vs. PD+
# ttestBF(subset(df_mean_reward_subject, group == 'Control')$mean_reward, subset(df_mean_reward_subject, group == 'PD+')$mean_reward, var.equal = TRUE)
# t.test(subset(df_mean_reward_subject, group == 'Control')$mean_reward, subset(df_mean_reward_subject, group == 'PD+' )$mean_reward, var.equal = T)

# Control vs. PD-
# ttestBF(subset(df_mean_reward_subject, group == 'Control')$mean_reward, subset(df_mean_reward_subject, group == 'PD-')$mean_reward, var.equal = TRUE)
# t.test(subset(df_mean_reward_subject, group == 'Control')$mean_reward, subset(df_mean_reward_subject, group == 'PD-' )$mean_reward, var.equal = T)

# PD+ vs. PD-
# ttestBF(subset(df_mean_reward_subject, group == 'PD+')$mean_reward, subset(df_mean_reward_subject, group == 'PD-')$mean_reward, var.equal = TRUE)
# t.test(subset(df_mean_reward_subject, group == 'PD+')$mean_reward, subset(df_mean_reward_subject, group == 'PD-' )$mean_reward, var.equal = T)

# Plot for Computational Psychiatry Conference (Tübingen, July 2025)
# p_performance_by_group_CPP <- ggplot(df_mean_reward_subject, aes(x = group, y = mean_reward, color = group, fill = group, shape = group)) +
#   #geom_hline(yintercept=.5, linetype='dashed', color='red') + # random choice model (mean across all 40 environments)
#   geom_boxplot(alpha = 0.2, outlier.shape = NA, width = 0.5) +  
#   geom_jitter(width = 0.15, size = 2, alpha = 0.8) +  
#   stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 2) +  
#   scale_color_manual(values = groupcolors) +
#   scale_fill_manual(values = groupcolors) +
#   ylab("Mean normalized reward") +
#   xlab("") +
#   ggtitle("Performance") +
#   theme_classic() +
#   theme(
#     strip.background = element_blank(),
#     strip.text = element_text(color = "black", size = 20),
#     legend.position = 'none',
#     legend.title = element_blank(),
#       plot.title = element_text(size=24),
#       axis.text = element_text(size=20),
#       axis.title = element_text(size=20)
#   )
# 
# ggsave("plots/performance_by_group_CPP.png", p_performance_by_group_CPP, dpi=300, height = 5, width = 6 )
# ggsave("plots/performance_by_group_CPP.pdf", p_performance_by_group_CPP, height = 5, width = 6 )
Figure 2: Obtained rewards by group. Each dot is one participant’s mean reward across all rounds and trials.

4.3 Performance: Learning curves

Participants’ learning curves (Figure 3) show the average reward obtained in each trial across rounds. For both polyneuropathy patients (Control) and PD patients on medication (PD+), the mean reward increased across trials, suggesting that they effectively balanced exploration and exploitation to maximize rewards. In stark contrast, PD patients off medication (PD-) showed no improvement across trials.

TO DO: Add random reward as baseline

Figure 3: Learning curves, showing obtained mean reward for each trial, aggregated across rounds.
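The aggregation behind such learning curves can be sketched as follows, using toy data in the same long format as dat (variable names mirror the dataset; the numbers are made up):

```r
library(dplyr)

# Toy data: 4 subjects, rounds 2-9, 25 trials each, rewards z in [0, 1]
set.seed(1)
toy <- expand.grid(id = 1:4, round = 2:9, trial = 1:25) %>%
  mutate(z = pmin(1, pmax(0, 0.4 + 0.01 * trial + rnorm(n(), 0, 0.1))))

# Mean reward per trial: first average across rounds within subjects,
# then across subjects (with standard errors)
learning_curve <- toy %>%
  group_by(id, trial) %>%
  summarise(mean_z = mean(z), .groups = "drop") %>%
  group_by(trial) %>%
  summarise(mean_reward = mean(mean_z),
            se = sd(mean_z) / sqrt(n()), .groups = "drop")
```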

4.3.1 Performance: Role of physiological and cognitive assessments (BDI, MMSE, Hoehn-Yahr)

We also assessed patients in terms of their depressive symptoms (via BDI-II), cognitive functioning (via Mini-Mental State Examination, MMSE), and severity of motor symptoms (via Hoehn-Yahr scale, Parkinson’s disease patients only). We ran a hierarchical regression with reward as the dependent variable and group, BDI score, and MMSE score as predictors, with random intercepts for participants to account for individual differences. This analysis yielded only an effect of group, suggesting that BDI and MMSE scores were not related to performance.

TO DO: Keep eye on MMSE scores, approaching significance

Code
# Hierarchical frequentist regression with random intercept: Reward as function of BDI and MMSE score (all patients)    
lmer_performance_BDI_MMSE <- lmer(z ~ group + BDI + MMSE + (1 | id), 
                                  data = subset(dat, trial > 0 & round %in% 2:9))

#summary(lmer_performance_BDI_MMSE)
tab_model(lmer_performance_BDI_MMSE, title = "Hierarchical regression results: Performance as function of BDI and MMSE score.", bpe="mean")
Hierarchical regression results: Performance as function of BDI and MMSE score.
  z
Predictors Estimates CI p
(Intercept) 0.27 -0.20 – 0.75 0.264
group [PD+] -0.05 -0.08 – -0.01 0.005
group [PD-] -0.12 -0.15 – -0.09 <0.001
BDI 0.00 -0.00 – 0.00 0.869
MMSE 0.01 -0.00 – 0.03 0.100
Random Effects
σ2 0.06
τ00 id 0.00
ICC 0.06
N id 97
Observations 19400
Marginal R2 / Conditional R2 0.037 / 0.094
Code
# Hierarchical Bayesian regression with random intercept: Reward as function of BDI and MMSE score (all patients)                              
brm_performance_BDI_MMSE <- run_model(brm(z ~ group + BDI + MMSE + (1|id),
                                          data=subset(dat, trial > 0 & round %in% 2:9 ),
                                          chains = 4,                               
                                          cores = 4,                                
                                          save_pars = save_pars(all = TRUE),
                                          seed = 0815,
                                          iter = 5000,
                                          warmup=1000,
                                          backend = "cmdstanr",
                                          control = list(adapt_delta = 0.99, max_treedepth = 15)),
                                      #prior = prior(normal(0,10), class = "b")),
                                      modelName = 'brm_performance_BDI_MMSE')
#tab_model(brm_performance_assessment, bpe="mean", title = "Hierarchical Bayesian regression: Performance as function of BDI and MMSE score.") 
#bayes_R2(brm_performance_assessment) 
#tab_model(lmer_performance_BDI_MMSE, brm_performance_BDI_MMSE, title = "Hierarchical regression results: Performance as function of BDI and MMSE score.", bpe="mean")

Next, we ran a hierarchical regression for Parkinson’s patients only, with reward as dependent variable and group, BDI, MMSE, and Hoehn-Yahr score as predictors; with random intercepts for participants to account for individual differences. This analysis only yielded an influence of group, i.e. being on or off L-Dopa.

Code
# Hierarchical frequentist regression with random intercept: Reward as function of BDI, MMSE, and Hoehn-Yahr score (Parkinson's patients only)   
lmer_reward_PD_only_BDI_MMSE_HY <- lmer(z ~ group + BDI + MMSE + hoehn_yahr + (1 | id), 
                                        data = subset(dat, trial > 0 & round %in% 2:9 & group != "Control"))

#summary(lmer_reward_PD_only_BDI_MMSE_HY)

tab_model(lmer_reward_PD_only_BDI_MMSE_HY, title = "Hierarchical regression results: Performance of patients with Parkinson's disease as function of BDI, MMSE, and Hoehn-Yahr score.",  bpe="mean")
Hierarchical regression results: Performance of patients with Parkinson's disease as function of BDI, MMSE, and Hoehn-Yahr score.
  z
Predictors Estimates CI p
(Intercept) 0.45 -0.07 – 0.98 0.093
group [PD-] -0.08 -0.10 – -0.05 <0.001
BDI 0.00 -0.00 – 0.01 0.157
MMSE 0.00 -0.01 – 0.02 0.607
hoehn yahr 0.01 -0.01 – 0.03 0.456
Random Effects
σ2 0.06
τ00 id 0.00
ICC 0.04
N id 64
Observations 12800
Marginal R2 / Conditional R2 0.025 / 0.063
Code
# Hierarchical Bayesian regression with random intercept: Reward as function of BDI, MMSE, and Hoehn-Yahr score (Parkinson's patients only)                          
brm_performance_PD_only_BDI_MMSE_HY <- run_model(brm(z ~ group + BDI + MMSE + hoehn_yahr + (1|id),
                                                     data=subset(dat, trial > 0 & round %in% 2:9 & group != "Control"),
                                                     cores=4,
                                                     chains = 4,
                                                     seed = 0815,
                                                     iter = 5000,
                                                     warmup=1000,
                                                     backend = "cmdstanr",
                                                     control = list(adapt_delta = 0.99)),
                                                 #prior = prior(normal(0,10), class = "b")),
                                                 modelName = 'brm_performance_PD_only_BDI_MMSE_HY')

# tab_model(lmer_reward_PD_only_BDI_MMSE_HY, brm_performance_PD_only_BDI_MMSE_HY, title = "Hierarchical regression results: Performance of patients with Parkinson's disease as function of BDI, MMSE, and Hoehn-Yahr score.",  bpe="mean")

4.4 Exploration vs. exploitation choices

To investigate the temporal dynamics of exploration and exploitation, we determined for each trial whether the chosen tile was novel (an exploration decision) or had been selected previously (an exploitation decision). Intuitively, learners should predominantly explore at the beginning of each round to identify high-reward options, and gradually shift toward exploitative behavior as the end of the round approaches.

Code
# proportion of unique choices per round per subject
df_unique_choices_round <- 
  dat %>%
  filter(round %in% 2:9 & trial > 0) %>% 
  group_by(id,group, round) %>%  
  summarize(
    total = n(),  #  number of trials
    unique_tiles = n_distinct(x, y),  # unique (x, y) combinations (i.e., tiles)
    repeat_tiles = total - unique_tiles
  ) %>% 
  mutate(prop_unique = unique_tiles/total,
         prop_repeat = repeat_tiles/total) 

# proportion of unique choices across 8 rounds per subject
df_unique_choices_subject <- df_unique_choices_round %>% 
  group_by(id, group) %>% 
  summarize(m_prop_unique = mean(prop_unique),
            m_prop_repeat = mean(prop_repeat))

dat <- dat %>%
  group_by(id, round) %>%
  arrange(trial, .by_group = TRUE) %>%  # Ensure data is sorted by trial
  mutate(
    is_new = factor(if_else(!duplicated(chosen), "new", "repeat")),  # Check uniqueness based on 'chosen' column
    is_new_label = factor(if_else(!duplicated(chosen), "Exploration", "Exploitation"), levels = c("Exploration", "Exploitation"))
  ) %>%
  ungroup()

dat_repeat_prop <- dat %>%
  filter(trial > 0 & round %in% 2:9) %>% 
  group_by(id, group, trial) %>%
  summarize(
    prop_repeat = mean(is_new == "repeat", na.rm = TRUE)  # Calculate proportion of "repeat" (exploitation) choices
    # prop_new = mean(is_new == "new", na.rm = TRUE)  # Calculate proportion of "new" (exploration) choices
  )

Figure 4 shows that both Control and PD+ patients increased the amount of exploitation over time, indicating a goal-directed shift from exploring novel options to exploiting known high-value options. The Control group began exploiting earlier in the round and exhibited a stronger overall tendency toward exploitation than the PD+ group, suggesting that this earlier focus on exploitation underlies their better performance. In stark contrast, PD- patients predominantly engaged in exploration and showed only a weak tendency toward exploitation as the search horizon approached its end. This pattern is also reflected in the overall proportion of exploitation decisions (Figure 4, inset). Control patients made more exploitation decisions than PD+ patients (\(t(65)=2.6\), \(p=.011\), \(d=0.6\), \(BF=4.2\)), who in turn exploited more than PD- patients (\(t(62)=5.0\), \(p<.001\), \(d=1.3\), \(BF>100\)). Notably, PD- patients almost exclusively selected novel options during the task and only rarely exploited known options. These distinct behavioral patterns show how a suboptimal balance of exploration and exploitation affects obtained rewards.

4.5 Exploitation choices

Code
p_explore_exploit_choices <-  ggplot(df_unique_choices_subject, aes(x = group, y = m_prop_repeat, color = group, fill = group, shape = group)) +
  geom_boxplot(alpha = 0.2, width = 0.4, outlier.shape = NA) +  
  geom_jitter(width = 0.15, size = 2) +  
  stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 4) +
  scale_color_manual(values = groupcolors) +
  scale_fill_manual(values = groupcolors) +
  scale_y_continuous("Exploitation choices", 
                     breaks = c(0, 0.5, 1), 
                     labels = percent_format(accuracy = 1)) +
  coord_cartesian(ylim = c(0, 1)) +
  xlab("") +
  ggtitle("Exploitation choices") +
  theme_classic() +
  theme(
    strip.background = element_blank(),  
    strip.text = element_text(color = "black", size = 12),
    legend.position = "none",
    axis.title = element_text(size = 18),
    axis.text = element_text(size = 18),
    plot.margin = margin(0, 0, 0, 0),
    plot.title = element_text(size = 24)
  )

p_explore_exploit_choices

ggsave("plots/explore_exploit_choices.png", p_explore_exploit_choices, dpi=300, width = 6, height = 5)
Figure 4: Balancing exploration and exploitation. Shown are the mean proportions of exploitation decisions, aggregated over trials and rounds. Each dot is one participant.
Code
# Main plot
p_explore_exploit_choices_over_time <- ggplot(dat_repeat_prop, aes(x = trial, y = prop_repeat, fill = group, shape = group, color = group)) +
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2, alpha = 0.5, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "point", size = 3, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "line", position=position_dodge(width=0.5)) +  
  scale_fill_manual(values = groupcolors) + 
  scale_color_manual(values = groupcolors)+ 
  scale_y_continuous(labels = percent_format(accuracy = 1)) + 
  labs(
    x = "Trial",
    y = "Exploitation choices (M±95% CI)",
    title = "Exploration and exploitation over time"
  ) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        # legend.position = c(0.01, 0.4),  
        legend.position = "inside",  
        legend.position.inside = c(0.05, 0.95),  
        legend.justification = c(0, 1),
        legend.title = element_blank(),
        axis.title = element_text(colour="black",size = 18),
        axis.text = element_text(colour="black",size = 18),
        plot.margin = margin(0, 0, 0, 0),
        plot.title = element_text(size = 24),
        legend.text = element_text(colour="black", size=18)
  )


p_explore_exploit_choices_over_time

ggsave("plots/p_explore_exploit_choices_over_time.png", dpi=300, width = 8, height = 4.5)
Figure 5: Mean proportion of exploitation decisions per trial, aggregated over rounds.
Code
# Main plot
p_main <- ggplot(dat_repeat_prop, aes(x = trial, y = prop_repeat, fill = group, shape = group, color = group)) +
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2, alpha = 0.5, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "point", size = 3, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "line", position=position_dodge(width=0.5)) +  
  scale_fill_manual(values = groupcolors) + 
  scale_color_manual(values = groupcolors)+ 
  scale_y_continuous(labels = percent_format(accuracy = 1)) + 
  labs(
    x = "Trial",
    y = "Proportion exploitation  (M±95% CI)",
    title = "Exploration and exploitation over time"
  ) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        legend.position = c(0.01, 0.4),  
        legend.justification = c(0, 1),
        legend.title = element_blank(),
        axis.title = element_text(size = 20),
        axis.text = element_text(size = 20),
        plot.margin = margin(0, 0, 0, 0),
        plot.title = element_text(size = 24),
        legend.text = element_text(colour="black", size=18)
  )

# Inset plot
p_inset <-  ggplot(df_unique_choices_subject, aes(x = group, y = m_prop_repeat, color = group, fill = group, shape = group)) +
  geom_boxplot(alpha = 0.2, size = 0.5, outlier.shape = NA) +  
  geom_jitter(width = 0.15, size = 0.8) +  
  stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 2) +
  scale_color_manual(values = groupcolors) +
  scale_fill_manual(values = groupcolors) +
  scale_y_continuous("", 
                     breaks = c(0, 0.5, 1), 
                     labels = percent_format(accuracy = 1)) +
  coord_cartesian(ylim = c(0, 1.25)) +
  xlab("") +
  ggtitle("   Total exploitation choices") +
  theme_classic() +
  theme(
    strip.background = element_blank(),  
    strip.text = element_text(color = "black", size = 12),
    legend.position = "none",
    axis.title.y = element_text(size = 14),
    axis.text.y = element_text(size = 14),
    plot.margin = margin(0, 0, 0, 0),
    plot.title = element_text(size = 14, margin = margin(b = -10))
  )


# Combine plots
p_explore_exploit_choices_inset <- ggdraw() +
  draw_plot(p_main) +
  draw_plot(p_inset, x = 0.14, y = 0.53, width = 0.35, height = 0.35)

# p_explore_exploit_choices_inset

# ggsave("plots/explore_exploit_choices_combined.png", dpi=300, width = 8, height = 4.5)

# ggboxplot(df_unique_choices_subject, 
#                    x = "group", 
#                    # y = "m_prop_unique",
#                    y = "m_prop_repeat",
#                    color = "group", palette = groupcolors, fill = "group", alpha = 0.2, size=0.2,
#                    add = "jitter", jitter.size = 0.0, shape = "group", title = "   Total exploitation choices") +
# scale_y_continuous("", , #"Exploration choices", 
#                    breaks = c(0, 0.5, 1), 
#                    # limits = c(0,1),
#                    labels = percent_format(accuracy = 1)) +
# coord_cartesian(ylim=c(0,1.25)) +
# xlab("") +
# ggtitle("") +
# stat_compare_means(comparisons = list( c("Control", "PD+"), c("PD+", "PD-")  ),
#                    paired = FALSE, 
#                    method = "t.test", 
#                    aes(label = paste0("p = ", after_stat(p.format)))  ) +
# stat_summary(fun = mean, geom="point", shape = 23, fill = "white", size=2) +
# theme_classic() +
# theme(strip.background = element_blank(),  
#       strip.text = element_text(color = "black", size=12),
#       legend.position = "none",
#       axis.title.y = element_text(size=8),
#       axis.text.y = element_text(size=7),
#       plot.margin = margin(0, 0, 0, 0),
#       plot.title = element_text(size = 9, margin = margin(b = -10)), 
# )

# Plot for Computational Psychiatry Conference (Tübingen, July 2025)
# Main plot
# p_main_CPP <- ggplot(dat_repeat_prop, aes(x = trial, y = prop_repeat, fill = group, shape = group, color = group)) +
#   stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2, alpha = 0.5, position=position_dodge(width=0.5)) +  
#   stat_summary(fun = mean, geom = "point", size = 3, position=position_dodge(width=0.5)) +  
#   stat_summary(fun = mean, geom = "line", position=position_dodge(width=0.5)) +  
#   scale_fill_manual(values = groupcolors) + 
#   scale_color_manual(values = groupcolors)+ 
#   scale_y_continuous(labels = percent_format(accuracy = 1)) + 
#   labs(
#     x = "Trial",
#     y = "Exploitation choices (M±95% CI)",
#     title = "Exploration and exploitation over time"
#   ) +
#   theme_classic() +
#   theme(strip.background = element_blank(),  
#         strip.text = element_text(color = "black", size=12),
#         legend.position = c(0.01, 0.4),  
#         legend.justification = c(0, 1),
#         legend.title = element_blank(),
#         axis.title = element_text(size = 20),
#         axis.text = element_text(size = 20),
#         plot.margin = margin(0, 0, 0, 0),
#         plot.title = element_text(size = 24),
#         legend.text = element_text(colour="black", size=18)
#   )
# 
# # Inset plot
# p_inset_CPP <-  ggplot(df_unique_choices_subject, aes(x = group, y = m_prop_repeat, color = group, fill = group, shape = group)) +
#   geom_boxplot(alpha = 0.2, size = 0.5, outlier.shape = NA, width=0.75) +  
#   geom_jitter(width = 0.15, size = 0.8) +  
#   stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 2) +
#   scale_color_manual(values = groupcolors) +
#   scale_fill_manual(values = groupcolors) +
#   scale_y_continuous("", 
#                      breaks = c(0, 0.5, 1), 
#                      labels = percent_format(accuracy = 1)) +
#   coord_cartesian(ylim = c(0, 1.15)) +
#   xlab("") +
#   ggtitle("  Exploitation choices") +
#   theme_classic() +
#   theme(
#     strip.background = element_blank(),  
#     strip.text = element_text(color = "black", size = 12),
#     legend.position = "none",
#     axis.title.y = element_text(size = 14),
#     axis.text.y = element_text(size = 14),
#     plot.margin = margin(0, 0, 0, 0),
#     plot.title = element_text(size = 14, margin = margin(b = -10))
#   )
# 
# 
# # Combine plots
# p_explore_exploit_choices_CPP <- ggdraw() +
#   draw_plot(p_main_CPP) +
#   draw_plot(p_inset_CPP, x = 0.14, y = 0.53, width = 0.35, height = 0.35)
# 
# ggsave("plots/explore_exploit_choices_CPP.png", dpi=300, width = 7, height = 5)
# ggsave("plots/explore_exploit_choices_CPP.pdf", width = 7, height = 5)

The mean proportion of repeat choices (i.e., exploitation) differed among all groups, with patients with polyneuropathy (Control) showing higher levels of exploitation than both PD+ and PD- patients. Parkinson's patients on medication (PD+) exploited more than patients off medication (PD-).

  • Control vs. PD+: \(t(65)=2.6\), \(p=.011\), \(d=0.6\), \(BF=4.2\)
  • Control vs. PD-: \(t(63)=6.5\), \(p<.001\), \(d=1.6\), \(BF>100\)
  • PD+ vs. PD-: \(t(62)=5.0\), \(p<.001\), \(d=1.3\), \(BF>100\)
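The pairwise comparisons above were produced with `ttestPretty()` from `statisticalTests.R`, which is not shown here. As a hedged, minimal base-R sketch of the underlying computation — using hypothetical per-subject exploitation proportions, not the actual data, and the `BayesFactor` package already loaded in the housekeeping chunk:

```r
# Sketch of one pairwise group comparison (hypothetical data;
# the reported values come from ttestPretty() in statisticalTests.R)
set.seed(1)
prop_control <- rbeta(34, 8, 4)  # hypothetical per-subject exploitation proportions
prop_pd_off  <- rbeta(31, 3, 9)

n1 <- length(prop_control); n2 <- length(prop_pd_off)

# Student's two-sample t-test (df = n1 + n2 - 2, matching the integer dfs reported)
tt <- t.test(prop_control, prop_pd_off, var.equal = TRUE)

# Cohen's d based on the pooled standard deviation
sp <- sqrt(((n1 - 1) * var(prop_control) + (n2 - 1) * var(prop_pd_off)) / (n1 + n2 - 2))
d  <- (mean(prop_control) - mean(prop_pd_off)) / sp

# Default-prior Bayes factor for the same contrast
bf <- BayesFactor::extractBF(BayesFactor::ttestBF(prop_control, prop_pd_off))$bf
```

The same pattern, applied to each pair of groups, yields the \(t\), \(d\), and \(BF\) values listed above.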

4.5.1 Rewards obtained from exploration and exploitation

We also calculated the mean reward obtained from learners’ explore and exploit choices, respectively. Figure 6 shows that both Control and PD+ patients obtained higher rewards from their exploration choices than PD- patients, indicating greater adaptation to the structure of the environment. Similarly, for exploitative choices, PD+ and Control patients obtained higher rewards than PD- patients, showing that PD- patients not only exploited less frequently but also did so less efficiently.

Code
# mean reward 
df_reward_explore_exploit <- dat %>%
  filter(round %in% 2:9 & trial > 0) %>% 
  # group_by(id,group, round, is_new) %>%  
  group_by(id,group, is_new_label) %>%  
  summarize(
    n = n(),  #  number of trials
    mean_reward = mean(z)
  )

ggplot(df_reward_explore_exploit, aes(x = group, y = mean_reward, color = group, fill = group, shape = group)) +
  facet_wrap(~is_new_label) + 
  geom_boxplot(alpha = 0.2, size = 0.5, outlier.shape = NA, width=0.75) +  
  geom_jitter(width = 0.15, size = 0.8) +  
  stat_summary(fun = mean, geom = "point", shape = 23, fill = "white", size = 2) +
  scale_color_manual(values = groupcolors) +
  scale_fill_manual(values = groupcolors) +
  scale_y_continuous("Mean reward") + 
                   #breaks = c(0, 10, 20, 30, 40, 50), 
                   #labels = c("0","10","20","30","40","50"),
                   #limits = c(0, 55)) +
  xlab("") +
  ggtitle("Mean rewards for exploration and exploitation") +
  theme_classic() +
  theme(
    strip.background = element_blank(),  
    strip.text = element_text(color = "black", size = 14),
    legend.position = "none",
    axis.title = element_text(size = 14),
    axis.text = element_text(size = 14),
    # plot.margin = margin(0, 0, 0, 0),
    plot.title = element_text(size = 16)
  )

ggsave("plots/explore_exploit_choices_rewards.png", dpi=300, width = 7, height = 5)
ggsave("plots/explore_exploit_choices_rewards.pdf", width = 7, height = 5)
Figure 6: Mean rewards for explore versus exploit choices, averaged across all rounds and trials per participant. Each dot represents one participant.

Explore choices

  • Control vs. PD+: \(t(65)=1.0\), \(p=.311\), \(d=0.2\), \(BF=.39\)
  • Control vs. PD-: \(t(63)=3.4\), \(p=.001\), \(d=0.8\), \(BF=24\)
  • PD+ vs. PD-: \(t(62)=2.9\), \(p=.005\), \(d=0.7\), \(BF=7.6\)

Exploit choices

  • Control vs. PD+: \(t(60)=0.3\), \(p=.738\), \(d=0.1\), \(BF=.27\)
  • Control vs. PD-: \(t(49)=3.4\), \(p=.001\), \(d=1.0\), \(BF=24\)
  • PD+ vs. PD-: \(t(47)=3.2\), \(p=.002\), \(d=0.9\), \(BF=16\)

4.6 Spatial trajectories

4.6.1 Distance consecutive choices

We next consider participants’ spatial search trajectories (the distance between consecutive clicks). Distance is measured as the Manhattan distance between consecutive clicks, such that repeat clicks have distance 0, clicks on directly neighboring tiles have distance 1, and clicks farther away have distances >1.
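The `distance` column used in these analyses is assumed to be precomputed upstream; a minimal base-R sketch of the measure (the helper name `manhattan_consecutive` is illustrative, not from the original code):

```r
# Manhattan distance between the tile chosen on trial t and on trial t-1,
# within one round (illustrative helper; `distance` is assumed precomputed upstream)
manhattan_consecutive <- function(x, y) {
  # x, y: tile coordinates in trial order within one round
  c(NA, abs(diff(x)) + abs(diff(y)))  # NA for the first (randomly revealed) tile
}

x <- c(2, 2, 3, 3, 7)
y <- c(5, 5, 5, 6, 1)
manhattan_consecutive(x, y)
# → NA 0 1 1 9  (repeat click -> 0, neighboring tile -> 1, far jump -> >1)
```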

The most frequent choice was to select a neighboring tile (distance = 1), reflecting a local search approach (Wu et al., 2018). On average, Control patients had the shortest distances, indicating more local searches and repeated clicks. PD+ patients had greater distances than Control patients but shorter ones than PD- patients, who showed the greatest distances. The distribution of distances shows that the PD- group's high average is primarily due to the few repeat choices (distance = 0) they made, i.e., their very limited exploitation behavior.

Control patients had shorter search distances than the PD+ group (\(t(65)=-2.5\), \(p=.015\), \(d=0.6\), \(BF=3.4\)) and the PD- group (\(t(63)=-2.3\), \(p=.024\), \(d=0.6\), \(BF=2.3\)). There was no difference between Parkinson's patients on (PD+) and off (PD-) medication (\(t(62)=-0.4\), \(p=.661\), \(d=0.1\), \(BF=.28\)).

4.6.2 Types of choices

We can also categorize each consecutive click as “repeat” (clicking the same tile as on the previous trial), “near” (clicking a directly neighboring tile, i.e., distance = 1), or “far” (clicking a tile with distance > 1). We first computed for each participant the proportion of each choice type across all 8 rounds × 25 clicks = 200 search decisions and then plotted the mean proportion for each group.
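The `type_choice` column is assumed to be derived upstream from the consecutive-click distance; a minimal sketch of that mapping (the function name `classify_choice` is illustrative, not from the original code):

```r
# Map Manhattan distances between consecutive clicks onto choice types
# (illustrative; `type_choice` is assumed to be derived this way upstream)
classify_choice <- function(distance) {
  factor(ifelse(distance == 0, "repeat",
                ifelse(distance == 1, "near", "far")),
         levels = c("repeat", "near", "far"))
}

as.character(classify_choice(c(0, 1, 3, 1, 0)))
# → "repeat" "near" "far" "near" "repeat"
```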

The analyses reveal distinct search patterns across patient groups. Control participants had the highest proportion of repeat (exploit) decisions, followed by the PD+ group; the proportion of repeat decisions in the PD- group was minimal. These behaviors help explain the differences in learning curves, where Control patients showed the greatest improvement, followed by PD+ patients. In contrast, PD- patients exhibited no improvement across trials, due to their lower tendency to exploit high-reward options.

Code
df_types_choices_subject <- dat %>% 
  filter(round %in% 2:9 & trial > 0) %>% 
  group_by(id, group, type_choice) %>% 
  summarise(n = n()) %>% 
  complete(type_choice, fill = list(n = 0)) %>% # turn implicit missing values into explicit missing values
  group_by(id, group) %>%
  mutate(prop = n / sum(n)) 

df_types_choices_overall <- df_types_choices_subject %>% 
  group_by(group, type_choice) %>% 
  summarise(n = n(),
            mean_prop = mean(prop),
            SD_prop = sd(prop),
            se_prop = SD_prop / sqrt(n),
            lower_ci_prop = mean_prop - qt(1 - (0.05 / 2), n - 1) * se_prop,
            upper_ci_prop = mean_prop + qt(1 - (0.05 / 2), n - 1) * se_prop)

ggplot(df_types_choices_overall) +
  facet_grid(~group) +
  geom_bar(aes(x = type_choice, y = mean_prop, fill = group, alpha = type_choice), stat = "identity", colour = 'black') +
  scale_y_continuous("Average proportion", limits = c(0,0.7), breaks = seq(0,1,.2), expand = c(0, 0), labels =   scales::percent_format(accuracy=1)) +
  scale_alpha_discrete(range = c(0.2,1)) +
  scale_x_discrete("Search decision") +
  scale_fill_manual(values=groupcolors) +
  scale_color_manual(values=groupcolors) +
  ggtitle("Types of consecutive search decisions") +
  geom_text(aes(x = type_choice, y = mean_prop, label = scales::percent(mean_prop, accuracy = 1)), vjust = -0.5, size = 3) + 
  theme_classic() +
  theme(aspect.ratio = 1,
        plot.title = element_text(hjust = 0.5, size=24),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        text = element_text(colour = "black"),
        strip.background = element_blank(),
        axis.text = element_text(colour="black", size=16),
        axis.title = element_text(colour="black", size=16),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())  

ggsave("plots/distance_types.png", dpi=300,height=3, width=5)
Figure 7: Types of search decisions in terms of distance.

An analysis of consecutive choice types over time reveals clear differences in search behavior between the groups. Both Control and PD+ patients adapt their strategies as the round progresses by decreasing the proportion of near (distance = 1) and far (distance > 1) choices while increasing the proportion of repeat (exploit) decisions, indicating a shift from exploration to exploitation. Notably, the data indicate a faster shift to exploitation for Control patients compared to PD+ patients, with an earlier and stronger preference for re-selecting known high-reward options. In contrast, PD- patients show limited adaptation, with the proportions of each decision type remaining relatively stable throughout the round, aside from a slight increase in repeat decisions.

Code
df_types_choices_trial_subject <- dat %>%
  filter(round %in% 2:9 & trial > 0) %>%
  group_by(id, group, trial, type_choice) %>%
  summarise(n = n()) %>%
  complete(type_choice, fill = list(n = 0)) %>%  # 
  group_by(id, group, trial) %>%
  mutate(prop = n / sum(n)) %>%  
  ungroup()


ggplot(df_types_choices_trial_subject, aes(x = trial, y = prop, color = type_choice, group = type_choice)) +
  facet_wrap(~group) +  
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar", width = 0.2, alpha = 0.5, position=position_dodge(width=0.5)) +
  stat_summary(fun = mean, geom = "point", size = 2, position=position_dodge(width=0.5)) +  
  stat_summary(fun = mean, geom = "line", position=position_dodge(width=0.5)) +  
  scale_fill_manual(values = choice3_colors) + 
  scale_color_manual(values = choice3_colors)+ 
  labs(
    x = "Trial",
    y = "Mean proportion (±95% CI)",
    title = "Types of choices over time",
    color = "Type choice"
  ) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        legend.position = "inside", 
        legend.position.inside = c(0.15, 1),   
        legend.justification = c(1, 1),
        legend.title = element_blank(),
        legend.spacing.y = unit(0.05, 'cm'), 
        legend.key.height = unit(0.3, 'cm')  
  )    

Types of search decisions over time.
Code
ggsave("plots/types_choice_by_trial.png", dpi=300, width = 8, height = 4)

4.6.3 Distance as function of previous reward

Finally, we analyzed the relation between the reward obtained at time \(t\) and the search distance on the subsequent trial \(t+1\). If a large reward was obtained, participants should search more locally; conversely, after a low reward they should be more likely to search farther away.

Across all trials and rounds, search distance and previous reward were negatively correlated, indicating that participants tended to search farther away following lower rewards than following higher rewards. This relationship was stronger in Control patients (\(r=-.43\), \(p<.001\), \(BF>100\)) and PD+ patients (\(r=-.35\), \(p<.001\), \(BF>100\)) than in PD- patients (\(r=-.17\), \(p<.001\), \(BF>100\)). These findings suggest that PD patients off medication exhibited less adaptive search behavior than those on medication and individuals with polyneuropathies.
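The correlations above were computed with `corTestPretty()` from `statisticalTests.R` (not shown). A hedged sketch of an equivalent computation on hypothetical data, using `cor.test()` and the `BayesFactor` package loaded in the housekeeping chunk:

```r
# Sketch of the reward-distance correlation test (hypothetical data;
# the reported values come from corTestPretty() in statisticalTests.R)
set.seed(2)
previous_reward <- runif(200)                                     # normalized reward on trial t
distance <- pmax(0, round(4 - 3 * previous_reward + rnorm(200)))  # simulated distance on trial t+1

ct <- cor.test(previous_reward, distance)  # Pearson r and p-value
bf <- BayesFactor::extractBF(
  BayesFactor::correlationBF(previous_reward, distance))$bf  # default Bayes factor
```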

Code
# correlation of previous reward and distance of consecutive choices, by age group and environment
# overall, ignoring within-subject factor
# dat %>% 
#   filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
#   group_by(group) %>% 
#   summarise(corTestPretty(previous_reward, distance))

# mean correlation between distance and reward obtained on previous step
# first aggregated within each round and then within each subject
# such that there is one correlation for each subject

# reward_distance_cor <- dat %>% 
#   filter(trial != 0 & round %in% 2:9) %>% # exclude first (randomly revealed) tile and practice round and bonus round
#   group_by(id, round, group) %>% 
#   summarise(cor = cor(previous_reward, distance)) %>% 
#   mutate(cor = replace_na(cor, 0)) %>%  # in some rounds subjects clicked the same tile throughout; set cor=0
#   ungroup() %>% 
#   group_by(id, group) %>% 
#   summarise(mean_cor = mean(cor))

# mean correlation between distance and reward obtained on previous step as function of group
# reward_distance_cor %>% 
#   group_by(group) %>% 
#   summarise(n = n(),
#             m_cor = mean(mean_cor),
#             SD_cor = sd(mean_cor),
#             se_cor = SD_cor / sqrt(n),
#             lower_ci_cor = m_cor - qt(1 - (0.05 / 2), n - 1) * se_cor,
#             upper_ci_cor = m_cor + qt(1 - (0.05 / 2), n - 1) * se_cor)

#plot regression lines based on raw data
# ggplot(subset(dat, trial > 0 & round %in% 2:9), aes(x = previous_reward, y = distance, color = group)) +
#   facet_wrap(~group) +  
#   geom_jitter(alpha = 0.3, width = 0.1, height = 0.1) +  
#   geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +  
#   
#   ggtitle("Regression Lines for Distance by Previous Reward and Group") +
#   theme_minimal() +
#   xlab("Previous Reward") +
#   ylab("Distance")

Given the nested structure of the data, we next employed a Bayesian hierarchical regression analysis to predict search distance from the reward obtained on the previous step, with previous reward, group, and their interaction as population-level (fixed) effects and subject-wise random intercepts. These analyses show that both the magnitude of the reward obtained on the last step and group influence search distance. Notably, PD patients off medication (PD-) adapted their search behavior less in response to reward magnitude, while patients on medication (PD+) exhibited adaptation levels close to the Control group.

Code
# For now, random intercepts only; random intercept + random slope was not stable
# lmer_distance_reward <- lmer(distance ~ previous_reward * group + (previous_reward + group | id), 
# data = subset(dat, trial > 0 & round %in% 2:9))
# fit model
lmer_distance_reward <- lmer(distance ~ previous_reward * group + (1 | id), 
                             data = subset(dat, trial > 0 & round %in% 2:9))

#summary(lmer_distance_reward)
#emmeans(lmer_distance_reward, pairwise ~ previous_reward | group, pbkrtest.limit = 15000)

p_lmer_distance_reward <- plot_model(lmer_distance_reward, type = "pred", terms = c("previous_reward", "group")) +
  stat_summary(dat, mapping=aes(x=previous_reward, y=distance, color=group, fill=group,shape = group), fun=mean, geom='point', alpha=0.7, size=1, na.rm = TRUE)+
  # scale_x_continuous('Previous Reward', breaks = (c(0,10,20,30,40,50))) +
  ylab('Distance to Next Option')+
  scale_fill_manual(values=groupcolors) +
  scale_color_manual(values=groupcolors) +
  ggtitle('Search Distance ~ Previous Reward (lmer)') +
  theme_classic() +
  theme(legend.position = "inside", 
        legend.position.inside = c(0.85, 0.9),   # Use legend.position.inside
        legend.justification = c(1, 1),
        legend.title = element_blank(),
        legend.box.background =  element_blank(),
        legend.key = element_rect(fill = "white"),
        axis.text = element_text(colour = "black", size = 14),
        axis.title = element_text(colour = "black", size = 14)) +
  guides(color = guide_legend(
    override.aes = list(
      fill = NA, 
      size = 2
    )))

# p_lmer_distance_reward$layers[[2]]$show.legend <- FALSE
# p_lmer_distance_reward

ggsave("plots/regression_distance_reward_lmer.png", p_lmer_distance_reward, dpi=300, height=5, width=7)
Code
# Bayesian regression analysis
# run_model() is a wrapper for brm models: it saves the fitted model to the `brm/` directory the first time it is run; otherwise it loads it from disk
# Fixed effects: previous_reward and group.
# Random effects: random slopes and a random intercept for both previous_reward and group by id, i.e., the effect of previous_reward and group can vary across individuals (id).

# random intercept and random slope
# brm_distance_reward <- run_model(brm(distance ~ previous_reward * group + (previous_reward + group | id), 

# random intercept                                     
brm_distance_reward <- run_model(brm(distance ~ previous_reward * group + (1|id),
                                     data=subset(dat, trial > 0 & round %in% 2:9 ),
                                     cores=4,
                                     chains=4,
                                     backend = "cmdstanr",
                                     seed = 0815,
                                     iter = 5000,
                                     warmup=1000,
                                     control = list(adapt_delta = 0.99, max_treedepth = 15)),
                                 #prior = prior(normal(0,10), class = "b")),
                                 modelName = 'brm_distance_reward')
#tab_model(brm_distance_reward, bpe="mean", title = "Bayesian regression results: Search distance as function of reward on previous step.") 
#bayes_R2(brm_distance_reward) 

# generate predictions manually for plotting (otherwise it is difficult to overlay the mean empirical values via geom_point)
# prevReward <-  seq(-3,55) / 50 # normalized reward

prevReward <-  seq(round(min(dat$previous_reward, na.rm=T),1),round(max(dat$previous_reward, na.rm=T),1), 0.1) # normalized reward

group  <-  levels(dat$group)
newdat <-  expand.grid(previous_reward=prevReward, group=group)

# predict distance based on previous reward
preds <-  fitted(brm_distance_reward, re_formula=NA, newdata=newdat, probs=c(.025, .975))
predsDF <-  data.frame(previous_reward=rep(prevReward, 3),
                       group=rep(levels(dat$group), each=length(prevReward)),
                       distance=preds[,1],
                       lower=preds[,3],
                       upper=preds[,4])

# average distance
grid  <-  expand.grid(x1=0:7, x2=0:7, y1=0:7, y2=0:7)
grid$distance <-  NA

for(i in 1:dim(grid)[1]){
  grid$distance[i] <- dist(rbind(c(grid$x1[i], grid$x2[i]), c(grid$y1[i], grid$y2[i])), method = "manhattan")
}

meanDist  <-  mean(grid$distance)

# plot predictions
p_regression_distance_reward <- 
  ggplot() +
  stat_summary(dat, mapping=aes(x=previous_reward, y=distance, color=group, fill=group), fun=mean, geom='point', alpha=0.7, size=1, na.rm=T)+
  geom_line(predsDF, mapping=aes(x=previous_reward, y=distance, color=group), linewidth=1) +
  geom_ribbon(predsDF, mapping=aes(x=previous_reward, y=distance, ymin=lower, ymax=upper, fill=group), alpha=.3) +
  #geom_hline(yintercept=meanDist, linetype='dashed', color='red') + # mean distance
  # xlab('Normalized Previous Reward')+
  coord_cartesian(ylim= c(0,5), xlim= c(-0.01,1)) +
  scale_x_continuous(
    name = "Previous normalized reward",
    breaks = seq(0, 1, 0.1),
    labels = sprintf("%.1f", seq(0, 1, 0.1))) +
  ylab('Distance to next chosen option')+
  scale_fill_manual(values=groupcolors) +
  scale_color_manual(values=groupcolors) +
  labs(
  title = "Search Distance ~ Previous Reward",
  #subtitle = "(Bayesian hierarchical regression)"
  ) +
  theme_classic() +
  theme(legend.position = "inside", 
        legend.position.inside = c(0.85, 0.9),   
        legend.justification = c(1, 1),
        legend.title = element_blank(),
        legend.text = element_text(colour = "black", size = 18),
        plot.title = element_text(colour = "black", size = 24),
        plot.subtitle = element_text(colour = "black", size = 18),
        axis.text = element_text(colour = "black", size = 18),
        axis.title = element_text(colour = "black", size = 18)
        )        

p_regression_distance_reward
ggsave("plots/regression_distance_reward_brms.png", p_regression_distance_reward, dpi=300, height=5, width=7)
ggsave("plots/regression_distance_reward_brms.pdf", p_regression_distance_reward, height=5, width=7)

# mean values by previous reward and group

# df_summary <- dat %>%
#   group_by(previous_reward, group) %>%
#   summarise(
#     mean_distance = mean(distance, na.rm = TRUE),
#     .groups = "drop"
#   )

# Plot for Computational Psychiatry Conference (Tübingen, July 2025)
# ggplot() +
#   stat_summary(dat, mapping=aes(x=previous_reward, y=distance, color=group, fill=group), fun=mean, geom='point', alpha=0.7, size=1, na.rm=T)+
#   geom_line(predsDF, mapping=aes(x=previous_reward, y=distance, color=group), linewidth=1) +
#   geom_ribbon(predsDF, mapping=aes(x=previous_reward, y=distance, ymin=lower, ymax=upper, fill=group), alpha=.3) +
#   #geom_hline(yintercept=meanDist, linetype='dashed', color='red') + # mean distance
#   scale_x_continuous('Previous Reward', breaks = seq(0,50,10), labels = c(0,10,20,30,40,50)) +
#   scale_y_continuous('Distance to next choice', breaks = seq(0,7,1), labels = c(0,1,2,3,4,5,6,7))+ 
#   ylab('Distance to next choice') +
#   coord_cartesian(ylim= c(0,5)) +
#   ylab('Distance to next choice')+
#   scale_fill_manual(values=groupcolors) +
#   scale_color_manual(values=groupcolors) +
#   labs(
#   title = "Search Distance ~ Previous Reward",
#   subtitle = "(Bayesian hierarchical regression)") +
#   theme_classic() +
#   theme(legend.position = "inside", 
#         legend.position.inside = c(0.85, 0.9),   
#         legend.justification = c(1, 1),
#         legend.title = element_blank(),
#         legend.text = element_text(colour = "black", size = 18),
#         plot.title = element_text(colour = "black", size = 22),
#         plot.subtitle = element_text(colour = "black", size = 18),
#         axis.text = element_text(colour = "black", size = 18),
#         axis.title = element_text(colour = "black", size = 18)
#         )          
# 
# 
# ggsave("plots/regression_distance_reward_brms_CPP.png", dpi=300, height=5, width=6)
# ggsave("plots/regression_distance_reward_brms_CPP.pdf", height=5, width=6)
Figure 8: Bayesian hierarchical regression analysis of search distance as a function of previous reward. Dots are the empirical mean distances for each reward value, aggregated over participants, trials, and rounds.

4.7 Bonus round judgments

In the bonus round, participants made 15 search decisions and then predicted the rewards for 5 randomly chosen, previously unobserved tiles. Subsequently, they chose one of the five tiles and continued the round until the search horizon of 25 clicks was reached.

Data frame dat_bonus contains the following variables:

  • id: participant id
  • bonus_env_number: internal id of the bonus round environment
  • bonus_environment: environment condition, Smooth (high spatial correlation) or Rough (low spatial correlation)
  • x, y: coordinates of the random tiles on the grid for which participants were asked to provide reward estimates
  • givenValue: participant's reward judgment (scale 0-50)
  • howSecure: participant's confidence in the given reward judgment (scale 0-10)
  • chosen_x, chosen_y: coordinates of the tile chosen after making reward and confidence judgments for the 5 random tiles
  • true_z: ground truth, i.e., the true expected reward of a tile
  • error: absolute deviation between the participant's reward estimate (givenValue) and the ground truth (true_z)
  • chosen: whether the option was chosen or not (participants chose one of the five options after estimating their value and their confidence in the reward prediction)
Note

Charley: is this scaling still correct (taken from YKWG code)? bonus_environment$z <- bonus_environment$z * scale_factor + 5

Table 3: Bonus round data.
| id  | bonus_env_number | bonus_environment | x | y | givenValue | howSecure | chosen_x | chosen_y | true_z | chosen     | error | group   |
|-----|------------------|-------------------|---|---|------------|-----------|----------|----------|--------|------------|-------|---------|
| 111 | 38               | Rough             | 5 | 6 | 20         | 5         | 7        | 3        | 16.34  | not chosen | 3.66  | Control |
| 111 | 38               | Rough             | 2 | 7 | 26         | 4         | 7        | 3        | 16.16  | not chosen | 9.84  | Control |
| 111 | 38               | Rough             | 7 | 3 | 16         | 5         | 7        | 3        | 38.27  | chosen     | 22.27 | Control |
| 111 | 38               | Rough             | 0 | 7 | 28         | 3         | 7        | 3        | 23.99  | not chosen | 4.01  | Control |
| 111 | 38               | Rough             | 7 | 6 | 30         | 5         | 7        | 3        | 34.10  | not chosen | 4.10  | Control |
| 115 | 39               | Rough             | 0 | 0 | 19         | 4         | 3        | 1        | 25.86  | not chosen | 6.86  | Control |

4.7.1 Prediction error

Figure 9 shows the mean absolute error between participants’ estimates and the true underlying expected reward, for each group and environment. All groups performed better than a random baseline:

  • Control: \(t(32)=-14.7\), \(p<.001\), \(d=2.6\), \(BF>100\)
  • PD+: \(t(28)=-9.6\), \(p<.001\), \(d=1.8\), \(BF>100\)
  • PD-: \(t(25)=-8.1\), \(p<.001\), \(d=1.6\), \(BF>100\)

Pairwise comparisons provided no evidence for a difference between Control and PD+ or between PD+ and PD-, but moderate evidence for a difference between Control and PD-:

  • Control vs. PD+: \(t(60)=-1.2\), \(p=.221\), \(d=0.3\), \(BF=.49\)
  • Control vs. PD-: \(t(57)=-2.6\), \(p=.012\), \(d=0.7\), \(BF=4.1\)
  • PD+ vs. PD-: \(t(53)=-1.2\), \(p=.249\), \(d=0.3\), \(BF=.48\)
Figure 9: Prediction error of bonus round judgments. The red dotted line indicates a random baseline.
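The baseline comparisons above are one-sample t-tests of per-participant mean errors against the random-baseline error; a minimal sketch with simulated data (both the sample and the baseline value below are hypothetical, not those of the study):

```r
set.seed(1)
# Simulated per-participant mean prediction errors (hypothetical values)
errors <- rnorm(30, mean = 8, sd = 3)
chance_error <- 12.5  # hypothetical random-baseline error, not the study's

# One-sample t-test against the baseline; negative t means errors below chance
res <- t.test(errors, mu = chance_error)
d <- (mean(errors) - chance_error) / sd(errors)  # Cohen's d
```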

4.7.2 Prediction error and confidence

Code
# Across all judgments and participants, there was no systematic relation between confidence and prediction error:
# corTestPretty(dat_bonus$error, dat_bonus$howSecure, method = "kendall") 
# cor.test(dat_bonus$error, dat_bonus$howSecure, method = "kendall") 
# correlationBF(dat_bonus$error, dat_bonus$howSecure, method = "kendall") 

A Bayesian regression with prediction error as the dependent variable, confidence, group, and their interaction as population-level (“fixed”) effects, and a random intercept for participants showed that for Control patients confidence and prediction error were negatively correlated (i.e., lower confidence was associated with higher error), whereas for the two Parkinson groups there was no such relation.

Code
brm_bonus_confidence_error_by_group <- run_model(brm(error ~ howSecure * group + (1|id), 
                                                     data=dat_bonus, 
                                                     cores=4,  
                                                     chains=4,
                                                     backend = "cmdstanr",
                                                     control = list(adapt_delta = 0.99),
                                                     iter = 5000,
                                                     warmup=1000, 
                                                     seed = 0815), 
                                                 modelName = 'brm_bonus_confidence_error_by_group')
#tab_model(brm_bonus_confidence_error_by_group, bpe = "mean", title = "Bayesian regression results: Prediction error and confidence") 
#bayes_R2(brm_bonus_confidence_error_by_group)

4.7.3 Analysis of selected tiles

To analyze selected and not-selected options, we first averaged the predicted reward and confidence of the not-chosen tiles within subjects, and then compared chosen and not chosen options. Selected tiles tended to have higher predicted rewards. Participants were not more confident in selected options, and selected tiles did not have a higher true reward than not selected tiles.
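The chosen-vs-not-chosen comparison amounts to a paired test on within-subject means; a minimal sketch with simulated data (toy values, not the study data):

```r
set.seed(2)
n <- 25
# Hypothetical per-participant mean predicted rewards (toy values)
chosen_mean     <- rnorm(n, mean = 32, sd = 4)               # chosen tiles
not_chosen_mean <- chosen_mean - rnorm(n, mean = 3, sd = 2)  # lower on average

# Paired t-test on the within-subject difference
res <- t.test(chosen_mean, not_chosen_mean, paired = TRUE)
```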

Code
# average not-chosen tiles within subjects first
df_chosen_overall <- dat_bonus %>% 
  group_by(id, chosen) %>% 
  summarise(m_givenValue = mean(givenValue),
            m_howSecure = mean(howSecure),
            m_true_z = mean(true_z))

# df_chosen_overall %>% 
#   group_by(chosen) %>% 
#   summarise(predicted_reward = mean(m_givenValue),
#             confidence = mean(m_howSecure),
#             true_reward = mean(m_true_z)) %>% 
#   kable(format = "html", escape = F, digits = 2) %>%
#   kable_styling("striped", full_width = F)

df_chosen_group <- dat_bonus %>% 
  group_by(id, group, chosen) %>% 
  summarise(m_givenValue = mean(givenValue),
            m_howSecure = mean(howSecure),
            m_true_z = mean(true_z))

# df_chosen_group %>% 
#   group_by(group,chosen) %>% 
#   summarise(predicted_reward = mean(m_givenValue),
#             confidence = mean(m_howSecure),
#             true_reward = mean(m_true_z)) %>% 
#   kable(format = "html", escape = F, digits = 2) %>%
#   kable_styling("striped", full_width = F)

# chosen vs not chosen: predicted reward

ggboxplot(df_chosen_group, x = "chosen", y = "m_givenValue",
          color = "group", palette =groupcolors, fill = "group", alpha = 0.2,
          add = "jitter", shape = "group", title = "Predicted reward of chosen vs. not chosen options",
          facet.by = "group") +
  ylab("Predicted reward") +
  xlab("") +
  stat_compare_means(comparisons = list( c("chosen", "not chosen") ), paired = T, method = "t.test", label = "p.format") +
  stat_summary(fun = mean, geom="point", shape = 23, fill = "white", size=3) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        legend.title = element_blank()
  )

ggsave("plots/bonusround_chosen_not_chosen_options_predicted_reward.png", dpi=300, width=7, height = 5)
Figure 10: Predicted reward of chosen vs. not chosen options in bonus round.
  • TODO: relate the difference in confidence between chosen and not-chosen options to the individual beta estimates
Code
# chosen vs not chosen: confidence
ggboxplot(df_chosen_group, x = "chosen", y = "m_howSecure",
          color = "group", palette =groupcolors, fill = "group", alpha = 0.2,
          add = "jitter", shape = "group", title = "Confidence of chosen vs. not chosen options",
          facet.by = "group") +
  ylab("Confidence in reward prediction") +
  xlab("") +
  stat_compare_means(comparisons =  list( c("chosen", "not chosen") ), paired = T, method = "t.test", label = "p.format") +
  stat_summary(fun = mean, geom="point", shape = 23, fill = "white", size=3) +
  theme_classic() +
  theme(strip.background = element_blank(),  
        strip.text = element_text(color = "black", size=12),
        legend.title = element_blank()
  )

ggsave("plots/bonusround_chosen_not_chosen_options_confidence.png", dpi=300, width=7, height = 5)
Figure 11: Confidence in reward prediction of chosen vs. not chosen options in bonus round.

5 Appendix

5.1 Overview figure behavioral results

Code
cowplot::plot_grid(
  p_performance, p_learning_curves,
  p_explore_exploit_choices, p_explore_exploit_choices_over_time,
  p_distance_consecutive, p_regression_distance_reward,
  labels = "AUTO",
  label_size = 22,
  nrow = 3,
  rel_widths = c(0.45, 0.55),
  align = "hv"
)

ggsave("plots/behavioral_results.png", dpi=300, height=16, width=15)
ggsave("plots/behavioral_results.pdf", height=16, width=15)
Figure 12: Behavioral results.

5.2 Distribution of BDI, MMSE, and Hoehn-Yahr in each group

Code
# dotplot BDI
# p_dotplot_BDI <- 
ggplot(dat_sample, aes(x = BDI, fill = group)) +
  facet_wrap(~group) +
  geom_dotplot(binwidth = 1, dotsize = 1) +
  scale_fill_manual(values = groupcolors) +
  scale_x_continuous("BDI score") + 
  scale_y_continuous(NULL, breaks = NULL) + 
  coord_fixed(ratio = 15) +
  theme_classic() +
  theme(
    legend.title = element_blank(),
    legend.position = 'none',
    strip.text = element_text(size=14),
    legend.text =  element_text(colour="black"),
    text = element_text(colour = "black"),
    strip.background =element_blank(),
    axis.text.x = element_text(colour="black"),
    axis.text.y = element_text(colour="black"),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank())

Code
# p_dotplot_MMSE <- 
ggplot(dat_sample, aes(x = MMSE, fill = group)) +
  facet_wrap(~group) +
  geom_dotplot(binwidth = 1, dotsize = 1) +
  scale_fill_manual(values = groupcolors) +
  scale_x_continuous("MMSE score") + 
  scale_y_continuous(NULL, breaks = NULL) + 
  coord_fixed(ratio = 15) +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5, size = 10),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        text = element_text(colour = "black"),
        strip.background =element_blank(),
        axis.text.x = element_text(colour="black"),
        axis.text.y = element_text(colour="black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

Code
p_dotplot_HY <- ggplot(filter(dat_sample, group != "Control"), aes(x = hoehn_yahr, fill = group)) +
  facet_wrap(~group) +
  geom_dotplot(binwidth = 1, dotsize = 1) +
  scale_fill_manual(values = groupcolors) +
  scale_x_continuous("Hoehn-Yahr score") + 
  scale_y_continuous(NULL, breaks = NULL) + 
  coord_fixed(ratio = 15) +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5, size = 10),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        text = element_text(colour = "black"),
        strip.background =element_blank(),
        axis.text.x = element_text(colour="black"),
        axis.text.y = element_text(colour="black"),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

5.3 Performance as a function of BDI, MMSE, and Hoehn-Yahr

5.3.1 Performance as a function of depression score (BDI-II)

The plots show performance as a function of depression score (BDI-II), separately for each group. Opposing trends emerged: for patients with polyneuropathy (Control), the relation was negative, such that patients with higher depression scores obtained lower rewards; for the two Parkinson groups, the relation was positive, such that patients reporting more severe symptoms obtained higher rewards.

Code
df_correlation_groups_BDI <- df_mean_reward_subject %>%
  group_by(group) %>%
  summarise(correlation = cor(BDI, mean_reward, use = "complete.obs"))


ggplot(df_mean_reward_subject, aes(x = BDI, y = mean_reward)) +
  facet_wrap(~group,  scales = "free_x") +
  #geom_hline(data=filter(df_random_performance, environment=="Rough"), linetype="dotted", aes(yintercept=z_learn_envs)) + 
  #geom_hline(data=filter(df_random_performance, environment=="Smooth"), linetype="dotted",aes(yintercept=z_learn_envs)) +
  geom_smooth(colour = "black", linetype = "dashed", linewidth = 0.5, method="lm", se=T, alpha = 0.2) +
  #geom_jitter(aes(fill = BDI), shape = 21, alpha = jit_alpha, colour = "black") +
  geom_point(aes(fill = BDI), shape = 21, alpha = jit_alpha, colour = "black") +
  scale_fill_gradient(low="blue",high="red") +
  scale_y_continuous("Mean reward") + 
  scale_x_continuous("BDI score", breaks = c(0,5,10,15)) +
  coord_cartesian(xlim = c(0,15), ylim = c(20,40)) +
  ggtitle("Performance as a function of depression score") +
  geom_text(data = df_correlation_groups_BDI, aes(x = 2, y = 20, label = paste0("italic(r) == ", round(correlation, 2))), parse = TRUE, inherit.aes = FALSE, size = 4) +
  theme_classic() +
  theme(aspect.ratio = 1,
        plot.title = element_text(size =16),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        strip.background = element_blank(),
        strip.text = element_text(size=14),
        axis.text = element_text(colour = "black"),
        axis.text.x = element_text(size=12),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

ggsave("plots/performance_groups_BDI.png", dpi=300, height = 3, width = 7)
Figure 14: Performance as a function of depression score (BDI-II).

5.3.2 Performance as a function of the Mini-Mental State Examination (MMSE)

Higher values indicate better cognitive functioning.

Code
df_correlation_groups_MMSE <- df_mean_reward_subject %>%
  group_by(group) %>%
  summarise(correlation = cor(MMSE, mean_reward, use = "complete.obs"))

ggplot(df_mean_reward_subject, aes(x = MMSE, y = mean_reward)) +
  facet_wrap(~group,  scales = "free_x") +
  geom_smooth(colour = "black", linetype = "dashed", linewidth = 0.5, method="lm", se=T, alpha = 0.2) +
  geom_point(aes(fill = MMSE), shape = 21, alpha = jit_alpha, colour = "black") +
  scale_fill_gradient(low="blue",high="red") +
  scale_y_continuous("Mean reward") + 
  scale_x_continuous("MMSE score", breaks = c(27,28,29,30)) +
  coord_cartesian(xlim = c(27,30), ylim = c(23,40)) +
  ggtitle("Performance as a function of mental status (MMSE)") +
  geom_text(data = df_correlation_groups_MMSE, aes(x = 27.5, y = 23, label = paste0("italic(r) == ", round(correlation, 2))), parse = TRUE, inherit.aes = FALSE, size = 3) +
  theme_classic() +
  theme(aspect.ratio = 1,
        plot.title = element_text(size=16),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        strip.background = element_blank(),
        strip.text = element_text(size=14),
        axis.text = element_text(colour = "black"),
        axis.text.x = element_text(size=12),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

ggsave("plots/performance_groups_MMSE.png", dpi=300, height = 3, width = 7)
Figure 15: Performance as a function of the Mini-Mental State Examination (MMSE).

5.3.3 Performance as a function of Hoehn-Yahr (Parkinson patients only)

The Hoehn-Yahr scale provides basic information about the severity of motor impairments in Parkinson’s disease, with higher scores indicating greater severity.

Code
# separately for each group
df_correlation_groups_hoehn_yahr <- df_mean_reward_subject %>%  
  filter(group != "Control") %>% 
  group_by(group) %>%
  summarise(correlation = cor(hoehn_yahr, mean_reward, use = "complete.obs"))

ggplot(filter(df_mean_reward_subject, group != "Control"), aes(x = hoehn_yahr, y = mean_reward)) +
  facet_wrap(~group,  scales = "free_x") +
  #geom_hline(data=filter(df_random_performance, environment=="Rough"), linetype="dotted", aes(yintercept=z_learn_envs)) + 
  #geom_hline(data=filter(df_random_performance, environment=="Smooth"), linetype="dotted",aes(yintercept=z_learn_envs)) +
  geom_smooth(colour = "black", linetype = "dashed", linewidth = 0.5, method="lm", se=T, alpha = 0.2) +
  geom_jitter(aes(fill = hoehn_yahr), shape = 21, alpha = jit_alpha, width = 0.0001, colour = "black") +
  #geom_point(aes(fill = hoehn_yahr), shape = 21, alpha = jit_alpha, colour = "black") +
  scale_fill_gradient(low="blue",high="red") +
  scale_y_continuous("Mean reward") + 
  scale_x_continuous("Hoehn-Yahr score", breaks = c(1,2,3)) +
  coord_cartesian(xlim = c(1,3), ylim = c(23,40)) +
  ggtitle("Performance as a function of Hoehn-Yahr score") +
  geom_text(data = df_correlation_groups_hoehn_yahr, aes(x = 1.2, y = 23, label = paste0("italic(r) == ", round(correlation, 2))), parse = TRUE, inherit.aes = FALSE) +
  theme_classic() +
  theme(aspect.ratio = 1,
        plot.title = element_text(size=16),
        legend.title = element_blank(),
        legend.position = 'none',
        legend.text =  element_text(colour="black"),
        strip.background = element_blank(),
        strip.text = element_text(size=14),
        axis.text = element_text(colour = "black"),
        axis.text.x = element_text(size=12),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

Performance as a function of the Hoehn-Yahr scale (Parkinson patients only).
Code
ggsave("plots/performance_groups_HY.png", dpi=300, height = 3, width = 5)

5.3.4 Reading list

  • Frank et al. (2004)
    Cognitive reinforcement learning in parkinsonism
    Demonstrated that Parkinson’s patients off medication learn poorly from positive feedback, implicating dopamine in reward learning and adaptive action selection.
    Frank et al. (2004): Cognitive reinforcement learning in parkinsonism. Science, 306, 1940–1943

  • Daw et al. (2006)
    Cortical substrates for exploratory decisions in humans
    Showed that exploratory decisions activate frontopolar cortex and are modulated by uncertainty, supporting a neural mechanism for explore–exploit arbitration.
    Daw et al. (2006): Cortical substrates for exploratory decisions in humans. Nature, 441, 876–879

  • Cohen et al. (2007)
    How the brain manages the exploration–exploitation tradeoff
    Proposed a theoretical framework linking locus coeruleus–norepinephrine function to exploration, via regulation of cortical neural gain.
    Cohen et al. (2007): How the brain manages the exploration–exploitation dilemma. Phil. Trans. R. Soc. B, 362, 933–942

  • Frank et al. (2009)
    Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation
    Found that genetic variation in dopamine pathways predicts differences in both uncertainty-directed and random exploration.
    Frank et al. (2009): Dopaminergic genes predict individual differences in exploration and exploitation. Nat Neurosci, 12, 1062–1068

  • Kayser et al. (2015)
    Dopamine, locus of control, and the exploration–exploitation tradeoff
    Showed that dopamine synthesis capacity interacts with locus of control to shape exploration strategies.
    Kayser et al. (2015): Dopamine, locus of control, and the exploration–exploitation tradeoff. Neuropsychopharmacology, 40, 454–462

  • Addicott et al. (2017)
    A primer on the explore/exploit trade-off for psychiatry
    Reviewed relevance of explore/exploit mechanisms to psychiatric disorders and their utility in clinical research.
    Addicott et al. (2017): A primer on the explore/exploit trade-off for psychiatry. Neuropsychopharmacology, 42, 1931–1939

  • Gershman & Tzovaras (2018)
    Dopaminergic genes are associated with both directed and random exploration
    Found that polymorphisms in dopaminergic genes relate selectively to directed exploration, not random variability.
    Gershman & Tzovaras (2018): Dopaminergic genes and exploration. Neuropsychologia, 120, 97–104

  • Cinotti et al. (2019)
    Dopamine blockade impairs the exploration–exploitation trade-off in rats
    Provided causal evidence that dopamine D2 receptor antagonism reduces exploratory choices in a multi-armed bandit task.
    Cinotti et al. (2019): Dopamine blockade impairs the explore–exploit trade-off. Sci Rep, 9, 6770

  • Chakroun et al. (2020)
    Dopaminergic modulation of the exploration/exploitation trade-off in human decision-making
    Showed that L-DOPA boosts uncertainty-directed exploration but not random exploration in humans.
    Chakroun et al. (2020): Dopaminergic modulation of exploration. eLife, 9, e51260

  • Cremer et al. (2023)
    Disentangling the roles of dopamine and noradrenaline in the exploration–exploitation trade-off
    Found that dopamine primarily modulates directed exploration, while noradrenaline affects choice variability.
    Cremer et al. (2023): Disentangling roles of dopamine and noradrenaline in explore/exploit. Neuropsychopharmacology, 48, 1078–1086

  • Chen et al. (2024)
    Dopamine and norepinephrine mediate the explore/exploit trade-off in humans
    Combined pupillometry, pharmacology, and fMRI to show distinct roles of dopamine and norepinephrine in exploration and exploitation.
    Chen et al. (2024): Dopamine and norepinephrine mediate explore/exploit tradeoff. J Neurosci, 44

References

Abbott, A. (2010). Levodopa: The story so far. Nature, 466(7310), S6–S7.
Beck, A. T., Steer, R. A., Brown, G. K., et al. (1996). Beck depression inventory.
Folstein, M. F., Folstein, S. E., & McHugh, P. R. (1975). “Mini-mental state”: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12(3), 189–198.
Giron, A. P., Ciranka, S., Schulz, E., Bos, W. van den, Ruggeri, A., Meder, B., & Wu, C. M. (2023). Developmental changes in exploration resemble stochastic optimization. Nature Human Behaviour, 7(11), 1955–1967. https://doi.org/https://doi.org/10.1038/s41562-023-01662-1
Goetz, C. G., Poewe, W., Rascol, O., Sampaio, C., Stebbins, G. T., Counsell, C., Giladi, N., Holloway, R. G., Moore, C. G., Wenning, G. K., et al. (2004). Movement disorder society task force report on the hoehn and yahr staging scale: Status and recommendations the movement disorder society task force on rating scales for parkinson’s disease. Movement Disorders, 19(9), 1020–1028.
Hautzinger, M., Keller, F., & Kühner, C. (2006). Beck depressions-inventar (BDI-II). Harcourt Test Services.
Hoehn, M. M., & Yahr, M. D. (1967). Parkinsonism: Onset, progression, and mortality. Neurology, 17(5), 427–427.
Meder, B., Wu, C. M., Schulz, E., & Ruggeri, A. (2021). Development of directed and random exploration in children. Developmental Science, 24(4), e13095. https://doi.org/https://doi.org/10.1111/desc.13095
Sadeghiyeh, H., Wang, S., Alberhasky, M. R., Kyllo, H. M., Shenhav, A., & Wilson, R. C. (2020). Temporal discounting correlates with directed exploration but not with random exploration. Scientific Reports, 10(1), 4020.
Schulz, E., Wu, C. M., Ruggeri, A., & Meder, B. (2019). Searching for rewards like a child means less generalization and more directed exploration. Psychological Science, 30(11), 1561–1572. https://doi.org/10.1177/0956797619863663
Tambasco, N., Romoli, M., & Calabresi, P. (2018). Levodopa in parkinson’s disease: Current status and future developments. Current Neuropharmacology, 16(8), 1239–1252.
Wu, C. M., Schulz, E., Speekenbrink, M., Nelson, J. D., & Meder, B. (2018). Generalization guides human exploration in vast decision spaces. Nature Human Behaviour, 2, 915–924. https://doi.org/10.1038/s41562-018-0467-4